1
00:00:11,353 --> 00:00:13,237
LiveTextAccess.

2
00:00:13,494 --> 00:00:16,821
Training for real-time intralingual subtitlers.

3
00:00:18,692 --> 00:00:22,948
This is Unit 1. Understanding accessibility.

4
00:00:23,098 --> 00:00:25,831
Element 1. Basic concepts.

5
00:00:27,345 --> 00:00:31,170
This video lecture focuses on multimodal communication,

6
00:00:31,270 --> 00:00:34,836
which is a specific feature of audiovisual translation

7
00:00:34,880 --> 00:00:37,000
and live situations.

8
00:00:37,178 --> 00:00:38,623
My name is Rocío Bernabé,

9
00:00:39,016 --> 00:00:42,526
from the Internationale Hochschule SDI München, in Germany.

10
00:00:43,295 --> 00:00:45,122
I have prepared this video lecture

11
00:00:45,225 --> 00:00:47,451
in collaboration with the European Federation

12
00:00:47,638 --> 00:00:50,667
of Hard of Hearing People, in short, EFHOH.

13
00:00:52,897 --> 00:00:56,438
On completion of this training sequence, you will be able to explain

14
00:00:56,557 --> 00:00:59,384
the concept of multimodal communication

15
00:00:59,472 --> 00:01:03,320
and to describe the challenges that real-time subtitlers

16
00:01:03,400 --> 00:01:05,258
and end-users face.

17
00:01:07,544 --> 00:01:09,290
Let's take a look at the agenda.

18
00:01:10,118 --> 00:01:14,118
We start by talking about what modes are

19
00:01:14,160 --> 00:01:15,720
in audiovisual translation

20
00:01:15,800 --> 00:01:19,478
and why communication is considered to be multimodal.

21
00:01:20,242 --> 00:01:22,912
Then we discuss multimodality

22
00:01:23,000 --> 00:01:26,247
in the context of real-time intralingual subtitling,

23
00:01:26,574 --> 00:01:31,281
and we also talk about the intricacies of conveying information through a mode

24
00:01:31,381 --> 00:01:33,055
that is not the original one.

25
00:01:34,152 --> 00:01:39,600
The concept of multimodality seems easy to understand at first glance.

26
00:01:40,489 --> 00:01:45,083
In audiovisual translation, scholars such as Jorge Díaz Cintas

27
00:01:45,249 --> 00:01:48,092
classify modes into 2 categories:

28
00:01:48,228 --> 00:01:49,589
audio and video.

29
00:01:50,187 --> 00:01:55,351
Modes help us to classify the way in which a specific resource is realised.

30
00:01:55,984 --> 00:01:59,507
In multimodal communication, resources are realised

31
00:01:59,611 --> 00:02:01,533
either visually or aurally.

32
00:02:03,317 --> 00:02:10,134
Sociolinguists and semiotic scholars, such as Halliday, Kress or Van Leeuwen,

33
00:02:10,355 --> 00:02:16,084
explain that there are many different types of resources within a culture.

34
00:02:16,607 --> 00:02:19,681
These resources can be verbal, such as language,

35
00:02:19,766 --> 00:02:24,071
or non-verbal, such as gestures, images, sounds, or objects,

36
00:02:24,824 --> 00:02:27,744
for example clothes or food.

37
00:02:28,980 --> 00:02:33,759
Depending on the type of resource, a speaker can choose the video

38
00:02:33,853 --> 00:02:36,233
or the audio mode for its realisation.

39
00:02:36,363 --> 00:02:40,687
For instance, words are a resource that can be realised aurally,

40
00:02:41,105 --> 00:02:42,435
through the audio mode,

41
00:02:43,422 --> 00:02:47,523
and visually, by using subtitles, for example.

42
00:02:49,838 --> 00:02:53,120
When a message is rendered multimodally,

43
00:02:53,200 --> 00:02:56,457
the audience needs to access both channels

44
00:02:56,683 --> 00:02:59,266
to receive the complete message.

45
00:03:00,000 --> 00:03:02,213
However, this is not always the case.
46
00:03:03,412 --> 00:03:08,268
The reasons why one channel may not be available are manifold,

47
00:03:08,320 --> 00:03:12,867
and can range from a noisy environment to hearing loss.

48
00:03:14,269 --> 00:03:17,644
In such cases, alternatives need to be available.

49
00:03:18,551 --> 00:03:22,433
This is the essence of the work of audiovisual translators

50
00:03:22,611 --> 00:03:25,097
and the purpose of access services.

51
00:03:25,193 --> 00:03:29,042
That is, to provide an alternative way to access the information

52
00:03:29,120 --> 00:03:32,980
that is not reaching the audience through the original channel.

53
00:03:35,592 --> 00:03:41,224
Our job is to enable a diamesic change from one mode to another,

54
00:03:41,353 --> 00:03:43,880
which has been described by Carlo Eugeni

55
00:03:43,984 --> 00:03:45,957
as "diamesic translation".

56
00:03:46,996 --> 00:03:52,946
For instance, dialogues or narrations that are rendered aurally in the original

57
00:03:53,285 --> 00:03:56,658
can be conveyed visually using subtitles.

58
00:03:57,669 --> 00:04:03,240
In real-time subtitling, the subtitler generates this visual information

59
00:04:03,320 --> 00:04:04,901
that is then added

60
00:04:05,317 --> 00:04:09,311
to the original resources that were already rendered visually.

61
00:04:10,748 --> 00:04:13,027
This change from one mode to another

62
00:04:13,359 --> 00:04:16,560
includes words and other resources that are necessary

63
00:04:16,640 --> 00:04:18,274
to understand a message,

64
00:04:18,721 --> 00:04:19,546
for example, sounds,

65
00:04:19,768 --> 00:04:23,887
contextual information, and speaker identification.

66
00:04:24,471 --> 00:04:28,360
For instance, at a conference, subtitlers may render sounds,

67
00:04:28,440 --> 00:04:31,307
like "APPLAUSE" after a speech,

68
00:04:31,400 --> 00:04:34,215
or a sound to which a speaker may react,

69
00:04:34,730 --> 00:04:38,293
such as a siren from outside, or someone sneezing,

70
00:04:38,399 --> 00:04:40,795
or a loud bang in another room.

71
00:04:41,904 --> 00:04:46,593
This brings us to the challenges that real-time subtitlers face.

72
00:04:48,358 --> 00:04:50,359
The challenge of multimodality.

73
00:04:51,320 --> 00:04:54,589
The challenges that real-time subtitlers face

74
00:04:54,640 --> 00:04:57,814
in the process of rendering resources visually

75
00:04:57,923 --> 00:05:00,625
emerge from 3 main constraints.

76
00:05:01,293 --> 00:05:05,611
These are a limited amount of time and space for our subtitles,

77
00:05:05,694 --> 00:05:06,844
and latency.

78
00:05:07,133 --> 00:05:10,579
Latency refers to the maximum delay or time

79
00:05:10,672 --> 00:05:14,210
by which subtitles should appear on a screen.

80
00:05:16,545 --> 00:05:21,601
Subtitles should coincide as much as possible with speech onset.

81
00:05:22,088 --> 00:05:25,823
A minimal delay supports understanding and lip-reading,

82
00:05:26,102 --> 00:05:28,102
which is an additional input cue

83
00:05:28,188 --> 00:05:31,885
that persons with hearing loss often use in communication.

84
00:05:32,602 --> 00:05:37,220
Some examples of maximum delay in different contexts are:

85
00:05:37,691 --> 00:05:42,414
6 seconds for TV, 6 to 8 seconds in parliaments,

86
00:05:42,507 --> 00:05:45,141
and 3 seconds at conferences.

87
00:05:47,912 --> 00:05:52,865
These constraints of real-time situations have clear implications for subtitlers,

88
00:05:53,215 --> 00:05:56,888
who will continuously have to choose which resources to render.
89
00:05:57,864 --> 00:06:02,635
These choices are influenced by how well-organised a speaker is,

90
00:06:02,746 --> 00:06:05,477
by how fast he or she speaks,

91
00:06:05,741 --> 00:06:07,795
and by the working context.

92
00:06:08,507 --> 00:06:10,163
Let's see some examples.

93
00:06:11,586 --> 00:06:12,477
In parliaments,

94
00:06:12,570 --> 00:06:16,723
the most important features to be subtitled are, in this order:

95
00:06:17,774 --> 00:06:21,409
speech, which should be as verbatim as possible,

96
00:06:21,533 --> 00:06:25,762
and without features of orality, such as tone or stress.

97
00:06:26,889 --> 00:06:29,080
Then, speaker identification.

98
00:06:29,585 --> 00:06:33,638
This is especially important because words need to be attributed

99
00:06:33,680 --> 00:06:35,173
to the actual speaker.

100
00:06:35,346 --> 00:06:37,901
Otherwise, diplomatic incidents could occur.

101
00:06:38,805 --> 00:06:40,797
Then, contextual information,

102
00:06:40,887 --> 00:06:44,119
which becomes key when voting takes place.

103
00:06:44,620 --> 00:06:47,432
In voting cases, the other resources

104
00:06:47,542 --> 00:06:51,240
(speech, speaker identification, slides, etc.)

105
00:06:51,407 --> 00:06:53,022
are of less importance.

106
00:06:53,482 --> 00:06:55,788
Lastly, other materials.

107
00:06:56,073 --> 00:06:56,859
In parliaments,

108
00:06:57,029 --> 00:07:00,994
it rarely happens that somebody brings along materials

109
00:07:01,310 --> 00:07:03,434
such as pictures or slides.

110
00:07:04,145 --> 00:07:07,726
In most cases, this information is not relevant

111
00:07:07,800 --> 00:07:09,924
and will not be prioritised.

112
00:07:10,964 --> 00:07:13,393
Lastly, an example from conferences.

113
00:07:15,542 --> 00:07:18,514
At conferences, speech is also prioritised,

114
00:07:18,624 --> 00:07:20,922
as it is in parliaments.

115
00:07:21,809 --> 00:07:25,259
Identifying a speaker is often less important at conferences

116
00:07:25,359 --> 00:07:28,190
because it is usually quite clear who is speaking,

117
00:07:28,436 --> 00:07:31,738
especially when only one speaker is on stage.

118
00:07:32,362 --> 00:07:35,990
However, identifying a speaker may be relevant

119
00:07:36,000 --> 00:07:39,609
when there is a debate and speakers start to switch.

120
00:07:40,149 --> 00:07:41,000
In these cases,

121
00:07:41,080 --> 00:07:43,874
identifying the speaker becomes more critical,

122
00:07:44,015 --> 00:07:49,155
as subtitlers will have to pay more attention to mentioning the names.

123
00:07:50,452 --> 00:07:54,704
Another case of speaker identification at conferences would be

124
00:07:54,920 --> 00:07:58,437
when interpreters say something on their own behalf.

125
00:07:59,753 --> 00:08:03,468
For example, a simultaneous interpreter may say:

126
00:08:03,712 --> 00:08:05,733
"I cannot hear the speaker".

127
00:08:05,895 --> 00:08:08,875
Or "the microphone is shut off".

128
00:08:09,678 --> 00:08:12,404
In such cases, it is a small challenge

129
00:08:12,509 --> 00:08:16,558
to show very clearly in your text, as a subtitler,

130
00:08:16,897 --> 00:08:19,560
that this is something that the interpreter says,

131
00:08:19,640 --> 00:08:21,869
and not the original speaker.

132
00:08:23,824 --> 00:08:28,920
Sounds, like applause, are often included in subtitles at conferences,

133
00:08:29,143 --> 00:08:32,957
whereas contextual information such as "irony" is less common

134
00:08:33,000 --> 00:08:35,067
because the interaction is live.

135
00:08:36,080 --> 00:08:38,367
OK, let's recap now.
136
00:08:39,596 --> 00:08:43,910
Multimodality makes communication exciting and complex

137
00:08:43,960 --> 00:08:45,392
at the same time.

138
00:08:45,748 --> 00:08:50,545
Moreover, multimodality often requires a higher effort from both

139
00:08:50,730 --> 00:08:52,855
viewers and subtitlers.

140
00:08:53,475 --> 00:08:54,428
On the one hand,

141
00:08:54,543 --> 00:08:58,241
viewers or end-users will perceive more information

142
00:08:58,361 --> 00:08:59,614
through the visual mode,

143
00:09:00,369 --> 00:09:03,401
and at a pace that is set by the speaker.

144
00:09:04,565 --> 00:09:08,379
On the other hand, subtitlers continuously have to make choices

145
00:09:08,440 --> 00:09:11,861
about which resources should be rendered and when.

146
00:09:13,042 --> 00:09:14,586
Depending on the context,

147
00:09:14,666 --> 00:09:18,874
this will mean adding information or, conversely, reducing

148
00:09:18,920 --> 00:09:23,228
or condensing the message to provide subtitles in synchrony

149
00:09:23,310 --> 00:09:26,312
with the speech onset and with a minimal delay.

150
00:09:27,313 --> 00:09:31,504
You will learn how to do this in Unit 5 and Unit 6

151
00:09:31,616 --> 00:09:36,470
with our colleagues Wim Gerbecks, Carlo Eugeni and Silvia Velardi.

152
00:09:37,050 --> 00:09:38,815
For now, I say goodbye.

153
00:09:39,592 --> 00:09:40,802
Exercises.

154
00:09:41,804 --> 00:09:45,791
The exercises for this video lecture are in the Trainer’s Guide

155
00:09:45,890 --> 00:09:46,856
for Unit 1

156
00:09:46,944 --> 00:09:49,174
and in the PowerPoint presentation.

157
00:10:00,364 --> 00:10:02,806
LTA - LiveTextAccess.

158
00:10:03,531 --> 00:10:06,081
Universitat Autònoma de Barcelona.

159
00:10:07,152 --> 00:10:10,319
SDI - Internationale Hochschule.

160
00:10:11,454 --> 00:10:15,060
Scuola Superiore per Mediatori Linguistici.

161
00:10:16,109 --> 00:10:17,699
2DFDigital.

162
00:10:18,793 --> 00:10:22,147
The European Federation of Hard of Hearing People – EFHOH.

163
00:10:23,249 --> 00:10:24,397
VELOTYPE.

164
00:10:25,177 --> 00:10:26,548
SUB-TI ACCESS.

165
00:10:27,551 --> 00:10:32,661
European Certification and Qualification Association – ECQA.

166
00:10:35,886 --> 00:10:39,900
Co-funded by the Erasmus+ Programme of the European Union.

167
00:10:41,904 --> 00:10:55,960
Erasmus+ Project: 2018-1-DE01-KA203-004218.

168
00:10:57,240 --> 00:11:00,600
The information and views set out in this presentation

169
00:11:00,960 --> 00:11:02,763
are those of the authors

170
00:11:02,920 --> 00:11:06,480
and do not necessarily reflect the official opinion

171
00:11:06,800 --> 00:11:08,120
of the European Union.

172
00:11:09,240 --> 00:11:12,880
Neither the European Union institutions and bodies

173
00:11:13,440 --> 00:11:16,040
nor any person acting on their behalf

174
00:11:16,640 --> 00:11:19,320
may be held responsible for the use

175
00:11:19,680 --> 00:11:23,000
which may be made of the information contained herein.